A Coherent Biomedical Literature Clustering and Summarization Approach Through Ontology-Enriched Graphical Representations
نویسندگان
چکیده
In this paper, we introduce a coherent biomedical literature clustering and summarization approach that employs a graphical representation method for text using a biomedical ontology. The key of the approach is to construct document cluster models as semantic chunks capturing the core semantic relationships in the ontology-enriched scale-free graphical representation of documents. These document cluster models are used for both document clustering and text summarization by constructing Text Semantic Interaction Network (TSIN). Our extensive experimental results indicate our approach shows 45% cluster quality improvement and 72% clustering reliability improvement, in terms of misclassification index, over Bisecting K-means as a leading document clustering approach. In addition, our approach provides concise but rich text summary in key concepts and sentences. The primary contribution of this paper is we introduce a coherent biomedical literature clustering and summarization approach that takes advantage of ontologyenriched graphical representations. Our approach significantly improves the quality of document clusters and understandability of documents through summaries.
منابع مشابه
Mining and its Application in Biomedical Domain
Semantic Text Mining and its Application in Biomedical Domain Illhoi Yoo Xiaohua Hu, Ph.D A huge amount of biomedical knowledge and novel discoveries have been produced and collected in text databases or digital libraries, such as MEDLINE, because the most natural form to store information is text. In order to cope with this pressing text information overload, text mining is employed. However, ...
متن کاملA Graph-Based Biomedical Literature Clustering Approach Utilizing Term's Global and Local Importance Information
In this article, we present a graph-based knowledge representation for biomedical digital library literature clustering. An efficient clustering method is developed to identify the ontology-enriched k-highest density term subgraphs that capture the core semantic relationship information about each document cluster. The distance between each document and the k term graph clusters is calculated. ...
متن کاملSpatiotemporal integration of molecular and anatomical data in virtual reality using semantic mapping
We have developed a computational framework for spatiotemporal integration of molecular and anatomical datasets in a virtual reality environment. Using two case studies involving gene expression data and pharmacokinetic data, respectively, we demonstrate how existing knowledge bases for molecular data can be semantically mapped onto a standardized anatomical context of human body. Our data mapp...
متن کاملSaliency Cognition of Urban Monuments Based on Verbal Descriptions of Mental-Spatial Representations (Case Study: Urban Monuments in Qazvin)
Urban monuments encompass a wide range of architectural works either intentionally or unintentionally. These works are often salient due to their inherently explicit or hidden components and qualities in the urban context. Therefore, they affect the mental-spatial representations of the environment and make the city legible. However, the ambiguity of effective components often complicates their...
متن کاملClustering Large Collection of Biomedical Literature Based on Ontology-Enriched Bipartite Graph Representation and Mutual Refinement Strategy
In this paper we introduce a novel document clustering approach that solves some major problems of traditional document clustering approaches. Instead of depending on traditional vector space model, this approach represents a set of documents as bipartite graphs using domain knowledge in ontology. In this representation, the concepts of the documents are classified according to their relationsh...
متن کامل